# Clipped Gradient Methods 

We leverage existing codebases which implement WGAN-GP and StyleGAN2, specifically:

- https://github.com/w86763777/pytorch-gan-collections
- https://github.com/NVlabs/stylegan3

Rather than distributing tarballs of these full repos with our modifications applied directly (which
makes it difficult to see what changed without some digging), we provide two patches
that contain all of our modifications. Each patch can be applied with a single git command
(instructions are provided below).
We also believe this gives more credit and visibility to the original authors.


# WGAN-GP

Our patch adds the functionality to train a model with the normalized SGDA (nSGDA) algorithm, as well as 
the grafted Ada-nSGDA and AdaDir methods.
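
To give a rough idea of what a normalized update looks like, here is a minimal pure-Python sketch. It is not the code in the patch: the function name `normalized_sgd_step`, the `eps` term, and the choice of a single global L2 norm (rather than, say, a per-layer norm) are assumptions made for illustration only.

```python
import math


def normalized_sgd_step(params, grads, lr, eps=1e-12):
    """Return the parameters after one step p - lr * g / ||g||.

    Hypothetical helper: the norm is taken over all gradient entries at once,
    which is an assumption and may differ from the patch.
    """
    norm = math.sqrt(sum(g * g for g in grads))
    scale = lr / (norm + eps)  # eps only guards against division by zero
    return [p - scale * g for p, g in zip(params, grads)]


# The step length is (approximately) lr, whatever the scale of the gradient.
small = normalized_sgd_step([1.0, 1.0], [3e-4, 4e-4], lr=0.2)
large = normalized_sgd_step([1.0, 1.0], [3e4, 4e4], lr=0.2)
print(small, large)
```

In this sketch both calls move the parameters by the same amount, because only the direction of the gradient is used and the learning rate alone sets the step length.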

## Instructions

1. Download the reference codebase and check out the specific commit that we developed from:

    ```
    git clone git@github.com:w86763777/pytorch-gan-collections.git
    cd pytorch-gan-collections
    git checkout 860e90bd882eb5737103de55bcf0e977d6952343
    ```

2. Apply our patch, which contains code for gradient clipping and generating histograms, and set up 
    the virtual environment:

    ```bash
    git apply /PATH/TO/adaptive_pytorch-gan-collections.patch

    # setup the virtualenv (wherever you desire) and activate
    virtualenv ~/wgan-venv
    source ~/wgan-venv/bin/activate

    # install deps
    pip install -U pip setuptools
    pip install -r requirements.txt
    pip install pytorch_gan_metrics --no-deps

    # if you have a newer version of torch and it doesn't automatically download cu113:
    # pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113

    # directory required for cached data
    mkdir stats
    ```
    
    If you have issues with the `torch` version, install a recent version of `torch` and `torchvision` that is compatible with your version of CUDA.

    We primarily work with CIFAR and STL10, which will automatically be downloaded when you run an experiment. 
    You must, however, have the precomputed statistics required to compute the FID.
    
    These can be obtained by following the instructions 
    in the [original repo](https://github.com/w86763777/pytorch-gan-collections).
    Download [`cifar10.train.npz`](https://drive.google.com/drive/folders/1UBdzl6GtNMwNQ5U-4ESlIer43tNjiGJC) to `./stats`.
    
    We also provide the two required stats files in the supplementary material; copy these files to `stats/`.
    Note that `stl10.unlabeled.32.npz` was created using [`pytorch_gan_metrics`](https://github.com/w86763777/pytorch-gan-metrics),
    after modifying [calc_metrics.py](https://github.com/w86763777/pytorch-gan-metrics/blob/master/pytorch_gan_metrics/calc_metrics.py)
    and associated files to support resizing the loaded dataset. 
    The resizing converts STL10 from 48x48 to 32x32.


3. You can run our experiments using the following commands:

    ```bash
    # CIFAR
    # --------------------------------------------------

    ## Adam/Ada-nSGDA; optimal LR for both methods is 0.0002
    OPTIMIZER="adam"  # or "ada_normsgd"
    python wgangp.py --arch=res32 --batch_size=64 --loss=was --num_images=50000 \
        --dataset=cifar10  --fid_cache=./stats/cifar10.train.npz --logdir=./logs/cifar \
        --record --sample_step=500 --sample_size=64 --total_steps=100000 --z_dim=128 \
        --seed=0 \
        --alpha=10 \
        --opt=${OPTIMIZER} \
        --lr_D=0.0002 \
        --lr_G=0.0002 \
        --desc='reproduce'

    ## nSGDA; optimal LR for nSGDA for both CIFAR10 and STL10 is 0.2
    python wgangp.py --arch=res32 --batch_size=64 --loss=was --num_images=50000 \
        --dataset=cifar10  --fid_cache=./stats/cifar10.train.npz --logdir=./logs/cifar \
        --record --sample_step=500 --sample_size=64 --total_steps=100000 --z_dim=128 \
        --seed=0 \
        --alpha=10 \
        --opt=normsgd \
        --lr_D=0.2 \
        --lr_G=0.2 \
        --desc='reproduce'   
 

    # STL10
    # --------------------------------------------------

    ## Adam/Ada-nSGDA; optimal LR for both methods is 0.0002
    OPTIMIZER="adam"  # or "ada_normsgd"
    python wgangp.py --arch=res32 --batch_size=64 --loss=was --num_images=50000 \
        --dataset=stl10 --fid_cache=./stats/stl10.unlabeled.32.npz --logdir=./logs/stl10 \
        --record --sample_step=500 --sample_size=64 --total_steps=100000 --z_dim=128 \
        --seed=0 \
        --alpha=10 \
        --opt=${OPTIMIZER} \
        --lr_D=0.0002 \
        --lr_G=0.0002 \
        --desc='reproduce'

    ## nSGDA; optimal LR for nSGDA for both CIFAR10 and STL10 is 0.2
    python wgangp.py --arch=res32 --batch_size=64 --loss=was --num_images=50000 \
        --dataset=stl10 --fid_cache=./stats/stl10.unlabeled.32.npz --logdir=./logs/stl10 \
        --record --sample_step=500 --sample_size=64 --total_steps=100000 --z_dim=128 \
        --seed=0 \
        --alpha=10 \
        --opt=normsgd \
        --lr_D=0.2 \
        --lr_G=0.2 \
        --desc='reproduce'   
    
    ```

    The main parameters to experiment with are `--lr_G`/`--lr_D` (the generator and discriminator learning
    rates) and `--opt`, which specifies which optimizer to use (e.g. `sgd` or `extrasgd`, in addition to
    the `adam`, `ada_normsgd`, and `normsgd` values used above).
    `--desc` is a user-defined string that is appended to the output log directory as a 
    tag - it has no impact on training or evaluation.
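
When trying several learning rates, it can help to generate the command lines programmatically. The following is a minimal sketch using only flags that appear in the WGAN-GP commands above. The helper name `wgangp_command`, the `sweep_...` description tag, and the swept learning-rate values are hypothetical. Setting `--lr_D` and `--lr_G` to the same value is a simplification that mirrors the commands above.

```python
import shlex


def wgangp_command(opt, lr, dataset="cifar10",
                   fid_cache="./stats/cifar10.train.npz",
                   logdir="./logs/cifar", seed=0):
    """Build one wgangp.py command line (hypothetical helper)."""
    args = [
        "python", "wgangp.py", "--arch=res32", "--batch_size=64", "--loss=was",
        "--num_images=50000", f"--dataset={dataset}", f"--fid_cache={fid_cache}",
        f"--logdir={logdir}", "--record", "--sample_step=500", "--sample_size=64",
        "--total_steps=100000", "--z_dim=128", f"--seed={seed}", "--alpha=10",
        f"--opt={opt}", f"--lr_D={lr}", f"--lr_G={lr}",
        f"--desc=sweep_{opt}_lr{lr}",  # tag only; does not affect training
    ]
    return shlex.join(args)


# Illustrative sweep values around the reported nSGDA learning rate of 0.2.
commands = [wgangp_command("normsgd", lr) for lr in (0.1, 0.2, 0.4)]
for cmd in commands:
    print(cmd)
```

The printed lines can be pasted into a shell or passed to a job scheduler; nothing is executed by the sketch itself.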


# StyleGAN2

Our patch provides the necessary functionality to reproduce our experiments. 
Note that the nature of this code is highly experimental.

Ultimately, our implementation amounts to adding additional optimizers.
We experimented with a number of other techniques, and the code provided here reflects that.

All of the original StyleGAN2 code is property of Nvidia; ensure you respect their license agreement. 

## Instructions

1. Download the official StyleGAN**3** Nvidia repository and check out the specific commit that we developed from:

    ```
    git clone git@github.com:NVlabs/stylegan3.git
    cd stylegan3
    git checkout 407db86e6fe432540a22515310188288687858fa
    ```

2. Apply our patch, which contains code for gradient clipping and generating histograms, and set up the conda
    environment:

    ```bash
    git apply /PATH/TO/adaptive_stylegan3.patch
    conda env create -f environment.yml
    conda activate stylegan3
    ```

3. You must acquire the original FFHQ dataset (as instructed in the official Nvidia repository). 
    We then used Nvidia's provided `dataset_tool.py` script to resize the data to 128x128:

    ```bash
    python dataset_tool.py --source=/PATH/TO/ffhq/images1024x1024.zip \
        --dest=~/datasets/ffhq-128x128.zip \
        --resolution=128x128
    ```
    
    For LSUN, you can use the LSUN repo to download the Churches dataset and then once again use 
    Nvidia's provided dataset tool: 

    ```bash
    # download lsun repo
    git clone git@github.com:fyu/lsun.git

    # download the churches dataset
    cd lsun
    python download.py -c church_outdoor -o ../lsun

    # ... navigate back to the stylegan repo ...

    # unzip the dataset and use the stylegan dataset tool to convert it
    # --> pip install opencv-python lmdb 
    python dataset_tool.py --source /PATH/TO/lsun/church_outdoor_train_lmdb/ \
        --dest ../lsun_churches-128x128.zip --resolution=128x128
    ```

4. You can now run training with the different methods and datasets:
    ```bash

    DATASET="/PATH/TO/ffhq-128x128.zip"  
    # DATASET="/PATH/TO/lsun_churches-128x128.zip"

    OPTIMIZER="adam"
    # OPTIMIZER="ada_norm_sgd"
    # OPTIMIZER="adadir"

    python train.py --outdir=./logs/ --data=${DATASET} --cfg=stylegan2 \
        --gpus=1 --batch=32 --gamma=0.1024 --map-depth=2 --cbase=16384 \
        --optimizer=${OPTIMIZER} \
        --glr=0.0025  \
        --dlr=0.0025  \
        --kimg=2600 
    ```

    The main parameters to experiment with are `--glr`/`--dlr` (the generator and discriminator learning
    rates) and `--optimizer=(adam|sgd|norm_sgd|layernorm_sgd|ada_norm_sgd|adadir)`. `--kimg` specifies how 
    many thousands (k) of images to train for, and `--desc` is a user-defined string 
    that is appended to the output log directory as a tag - it has no impact on training or 
    evaluation.
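
As a quick sanity check on the training budget, the arithmetic behind `--kimg` can be worked out directly. The sketch below assumes that each training iteration consumes exactly one batch of `--batch` images; the helper name `training_budget` is hypothetical.

```python
def training_budget(kimg, batch):
    """Return (total images, iterations) for a given --kimg and --batch."""
    total_images = kimg * 1000            # --kimg counts thousands of images
    iterations = total_images // batch    # assumes one batch per iteration
    return total_images, iterations


# Values taken from the StyleGAN2 command above: --kimg=2600, --batch=32.
total_images, iterations = training_budget(kimg=2600, batch=32)
print(total_images, iterations)  # 2600000 81250
```

Under that assumption, the command above trains on 2.6 million images, or 81,250 iterations at batch size 32.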

